Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
                                            Some full text articles may not yet be available without a charge during the embargo (administrative interval).
                                        
                                        
                                        
                                            
                                                
                                             What is a DOI Number?
                                        
                                    
                                
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
- 
            Ozay, Necmiye; Balzano, Laura; Panagou, Dimitra; Abate, Alessandro (Ed.)The pursuit of robustness has recently been a popular topic in reinforcement learning (RL) research, yet the existing methods generally suffer from computation issues that obstruct their real-world implementation. In this paper, we consider MDPs with low-rank structures, where the transition kernel can be written as a linear product of feature map and factors. We introduce *duple perturbation* robustness, i.e. perturbation on both the feature map and the factors, via a novel characterization of (𝜉,𝜂) -ambiguity sets featuring computational efficiency. Our novel low-rank robust MDP formulation is compatible with the low-rank function representation view, and therefore, is naturally applicable to practical RL problems with large or even continuous state-action spaces. Meanwhile, it also gives rise to a provably efficient and practical algorithm with theoretical convergence rate guarantee. Lastly, the robustness of our proposed approach is justified by numerical experiments, including classical control tasks with continuous state-action spaces.more » « lessFree, publicly-accessible full text available June 4, 2026
- 
            Ozay, Necmiye; Balzano, Laura; Panagou, Dimitra; Abate, Alessandro (Ed.)Free, publicly-accessible full text available June 4, 2026
- 
            Ozay, Necmiye; Balzano, Laura; Panagou, Dimitra; Abate, Alessandro (Ed.)Many optimal and robust control problems are nonconvex and potentially nonsmooth in their policy optimization forms. In this paper, we introduce the Extended Convex Lifting (ECL) framework, which reveals hidden convexity in classical optimal and robust control problems from a modern optimization perspective. Our ECL framework offers a bridge between nonconvex policy optimization and convex reformulations. Despite non-convexity and non-smoothness, the existence of an ECL for policy optimization not only reveals that the policy optimization problem is equivalent to a convex problem, but also certifies a class of first-order non-degenerate stationary points to be globally optimal. We further show that this ECL framework encompasses many benchmark control problems, including LQR, state-feedback and output-feedback H-infinity robust control. We believe that ECL will also be of independent interest for analyzing nonconvex problems beyond control.more » « lessFree, publicly-accessible full text available June 4, 2026
- 
            Abate, Alessandro; Cannon, Mark; Margellos, Kostas; Papachristodoulou, Antonis (Ed.)In this paper, we introduce a data-driven framework for synthesis of provably-correct controllers for general nonlinear switched systems under complex specifications. The focus is on systems with unknown disturbances whose effects on the dynamics of the system is nonlinear. The specification is assumed to be given as linear temporal logic over finite traces (LTLf) formulas. Starting from observations of either the disturbance or the state of the system, we first learn an ambiguity set that contains the unknown distribution of the disturbances with a user-defined confidence. Next, we obtain a robust Markov decision process (RMDP) as a finite abstraction of the system. By composing the RMDP with the automaton obtained from the LTLf formula and performing optimal robust value iteration on the composed RMDP, we synthesize a strategy that yields a high probability that the uncertain system satisfies the specifications. Our empirical evaluations on systems with a wide variety of disturbances show that the strategies synthesized with our approach lead to high satisfaction probabilities and validate the theoretical guarantees.more » « less
 An official website of the United States government
An official website of the United States government 
				
			 
					 
					
 
                                     Full Text Available
                                                Full Text Available